Linear transformation approach to VTLN using dynamic frequency warping
Authors
Abstract
In this paper, we present a novel linear-transformation approach to frequency warping for vocal tract length normalisation (VTLN) based on the idea of dynamic frequency warping (DFW). A linear transformation among the mel-frequency cepstral coefficients (MFCCs) offers the computational advantage that features need not be recomputed for each warp factor in VTLN. The proposed method separates the smoothing and frequency-warping operations in the feature extraction stage, unlike the conventional approach, in which both operations are integrated into the filter-bank operation. The advantage of the proposed DFW approach is that a transformation matrix can be obtained for any arbitrary warping, even when the functional form or mapping of the warping function is unknown. We compare the performance of the proposed method with the approaches proposed in [4] and [5] on one phone classification task and two digit recognition tasks.
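As a rough illustration of how a cepstral-domain matrix can be built from an arbitrary warping given only as a bin-to-bin mapping, the sketch below composes an inverse DCT, a linear-interpolation warp, and a forward DCT. It is not the authors' algorithm: the function names, the unnormalised DCT-II basis, the linear interpolation, and the example mapping are all assumptions made for illustration, and in practice the mapping could come from a DFW alignment rather than a formula.

```python
import numpy as np

def dct_matrix(n_ceps, n_bins):
    """Unnormalised DCT-II basis mapping a smoothed log spectrum to cepstra (assumed form)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bins)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_bins)

def warp_matrix_from_mapping(mapping):
    """Build an interpolation matrix W from a discrete mapping.

    mapping[i] is the (possibly fractional) source bin read for output bin i.
    No functional form of the warping is required, only the mapping itself.
    """
    n_bins = len(mapping)
    W = np.zeros((n_bins, n_bins))
    for i, pos in enumerate(mapping):
        pos = min(max(float(pos), 0.0), n_bins - 1.0)  # clamp to valid bins
        lo = int(np.floor(pos))
        hi = min(lo + 1, n_bins - 1)
        frac = pos - lo
        W[i, lo] += 1.0 - frac  # linear interpolation weights
        W[i, hi] += frac
    return W

def cepstral_transform(mapping, n_ceps):
    """Return T such that warped cepstra are approximately T @ cepstra."""
    D = dct_matrix(n_ceps, len(mapping))
    W = warp_matrix_from_mapping(mapping)
    # pinv(D) maps cepstra back to a smoothed log spectrum, W warps it,
    # and D returns to the cepstral domain.
    return D @ W @ np.linalg.pinv(D)

# Hypothetical example: a mapping that reads each bin 10% further along the axis.
n_bins, n_ceps = 20, 13
example_mapping = 1.1 * np.arange(n_bins)
T = cepstral_transform(example_mapping, n_ceps)
```

With such a matrix computed once per candidate warping, each warped feature vector would be obtained as `T @ c` from the unwarped cepstrum `c`, which is the kind of saving the abstract describes: no feature recomputation per warp factor.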
Similar resources
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC
In this paper, we show that frequency-warping (including VTLN) can be implemented through linear transformation of conventional MFCC. Unlike the Pitz-Ney [1] continuous domain approach, we directly determine the relation between frequency-warping and the linear-transformation in the discrete-domain. The advantage of such an approach is that it can be applied to any frequency-warping and is not ...
Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC
Vocal Tract Length Normalization (VTLN) for standard filterbank-based Mel Frequency Cepstral Coefficient (MFCC) features is usually implemented by warping the center frequencies of the Mel filterbank, and the warping factor is estimated using the maximum likelihood score (MLS) criterion (Lee and Rose, 1998). A linear transform (LT) equivalent for frequency warping (FW) would enable more efficie...
MLLR-like speaker adaptation based on linearization of VTLN with MFCC features
In this paper, an MLLR-like adaptation approach is proposed whereby the transformation of the means is performed deterministically based on linearization of VTLN. Biases and adaptation of the variances are estimated statistically by the EM algorithm. In the discrete frequency domain, we show that under certain approximations, frequency warping with Mel-filterbank-based MFCCs equals a linear tran...
Frequency warping by linear transformation of standard MFCC
A novel linear transform (LT) is proposed for frequency warping (FW) with standard filterbank based MFCC features. Here, we use the idea of spectral interpolation of [9] to perform a continuous warping in the log filterbank output domain, and incorporate both interpolation and warping into a single warped IDCT matrix. The new transformation matrix is thus mathematically simpler than in [9], and...
Linear discriminant - a new criterion for speaker normalization
In Vocal Tract Length Normalization (VTLN) a linear or nonlinear frequency transformation compensates for different vocal tract lengths. Finding good estimates for the speaker specific warp parameters is a critical issue. Despite good results using the Maximum Likelihood criterion to find parameters for a linear warping, there are concerns using this method. We searched for a new criterion that...